COMBINE 2018 - Abstracts

Precise, scalable and compact visualization of rule-based models
James Greene and Michael Blinov

The rule-based approach allows representation and simulation of biological systems accounting for molecular features (molecular sites of binding and modification) and site-specific details of molecular interactions. While a rule-based description is very precise and can define very fine molecular details (like how the phosphorylation status of a single residue in a multi-protein complex can affect the affinity of another binding site of another protein within the same complex), it comes with a cost. Combining all the assumptions described in multiple rules into a single diagram is a daunting task that has not been accomplished so far. Various visualization schemas have been suggested, but they either consisted of a visual list of unconnected rules, provided a broad overview but lacked detail (like contact maps or extended contact maps), or concentrated on certain aspects of rule mechanisms (like atom-rule graphs). Here we present an approach and software for precise, scalable and compact representation of rule-based models. It is based on three basic concepts that allow scalability: the molecular pattern (molecules that participate in or affect the rule), the rule center (molecular sites that are directly modified by a rule), and the rule context (molecular sites that affect the rule). A bird's-eye view of the rules is provided by a diagram based on molecular patterns connected by the rules that modify them. The next level of resolution is achieved when the rule center is added, and the most detailed map is provided when the rule context is shown. The detailed map allows unique reconstruction of the BNGL file without the need for any supporting documentation. The scalable approach for visualization is implemented in the Virtual Cell (VCell) modeling and simulation framework. We demonstrate it here on two biological systems (early events in Epidermal Growth Factor Receptor (EGFR) signaling and IgE receptor (FceRI) signaling) thoroughly specified in VCell. We also provide examples of a complete visualization in Systems Biology Graphical Notation (SBGN)-compliant Process Diagram (PD) conventions performed in the yEd editor, providing an approach for a possible SBGN-PD extension for rule-based models.
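
To make the three concepts concrete, consider an illustrative BNGL-style rule, shown here as an annotated Python string (the molecule and site names are hypothetical examples, not taken from the models discussed above):

    # Illustrative only: a BNGL-style phosphorylation rule; molecule and
    # site names (EGF, EGFR, r, l, Y1068) are hypothetical examples.
    rule = "EGF(r!1).EGFR(l!1,Y1068~U) -> EGF(r!1).EGFR(l!1,Y1068~P) kp"

    # Molecular pattern: the whole EGF.EGFR complex named in the rule.
    # Rule center:       EGFR's site Y1068, directly modified (~U -> ~P).
    # Rule context:      the EGF bound at site l (bond !1), required for
    #                    the rule to fire but left unchanged by it.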

Rendering Complex Genetic Design with DNAplotlib and pySBOL2
Sunwoo Kang, Thomas Gorochowski, and Bryan Bartley

Visualizing genetic circuits is essential in the field of synthetic biology. The Synthetic Biology Open Language (SBOL) Visual standard provides symbols and visual guidelines to represent complex genetic designs in a standardized way and facilitates clear and unambiguous communication of research. The first SBOL Visual standard focused on DNA components and sequence annotations, while the latest updates for version 2.2 incorporated features such as glyphs for non-DNA components and hierarchical interactions between modules. In this talk, I will present how I have been updating the Python package DNAplotlib during my Google Summer of Code project to support the following features: 1. rendering of non-DNA components; 2. defining modules and interactions between them; 3. visualizing submodules within modules; and 4. importing design data from CSV / SBOL2-compliant files. The talk will include a short introduction on how genetic circuits can be created, rendered and modified. This update aims to provide a quick tool for visualizing genetic circuits that will be of great use to synthetic biology researchers.
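
As a sketch of the kind of rendering workflow involved, the following uses DNAplotlib's existing renderer to draw a simple DNA design (part names are arbitrary; the exact API for the new non-DNA and module features described in this talk may differ):

    import matplotlib.pyplot as plt
    import dnaplotlib as dpl

    # A simple design: promoter -> RBS -> CDS -> terminator.
    design = [
        {'type': 'Promoter',   'name': 'p1',   'fwd': True},
        {'type': 'RBS',        'name': 'rbs1', 'fwd': True},
        {'type': 'CDS',        'name': 'gfp',  'fwd': True},
        {'type': 'Terminator', 'name': 't1',   'fwd': True},
    ]

    fig, ax = plt.subplots(figsize=(4, 1))
    dr = dpl.DNARenderer()
    start, end = dr.renderDNA(ax, design, dr.SBOL_part_renderers())
    ax.set_xlim([start, end])
    ax.axis('off')
    plt.show()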

Visualizing Metabolic Network Dynamics through Time-Series Metabolomics Data
Lea F. Buchweitz, James T. Yurkovich, Christoph M. Blessing, Veronika Kohler, Fabian Schwarzkopf, Zachary A. King, Laurence Yang, Freyr Jóhannsson, Ólafur Sigurjónsson, Óttar Rolfsson, Julian Heinrich, and Andreas Dräger

New technologies have given rise to an abundance of -omics data, particularly metabolomics data. The scale of these data introduces new challenges for the interpretation and extraction of knowledge, requiring the development of new computational visualization methodologies.

Here, we present a new method for visualizing time-course metabolomics data within the context of a metabolic network map. We demonstrate the utility of this method by examining previously published data for two cellular systems—the human platelet and the erythrocyte under cold storage for use in transfusion medicine. Our approach allowed new insights into the state of the platelet metabolome during storage, most notably that nicotinamide secretion mirrors that of hypoxanthine and might, therefore, reflect similar pathway usage. These results are compiled into descriptive animation videos that show the dynamic metabolite concentrations across the entire metabolic network, as well as focused views of individual pathways.

This method can be applied to any system with data and a metabolic map to help visualize and understand physiology at the network level. More broadly, we hope that our approach will provide the blueprints for new visualizations of other longitudinal -omics data types.
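
The general idea of overlaying per-timepoint metabolite concentrations on a network map can be sketched with the Escher Python package (using Escher here is our assumption for illustration; the map name, metabolite identifiers, and values are placeholders):

    import escher

    # Hypothetical time course: metabolite id -> concentration per timepoint.
    timepoints = [
        {'nicotinamide_c': 0.10, 'hypoxanthine_c': 0.12},  # day 1
        {'nicotinamide_c': 0.35, 'hypoxanthine_c': 0.40},  # day 10
    ]

    builder = escher.Builder(map_name='e_coli_core.Core metabolism')
    for i, data in enumerate(timepoints):
        builder.metabolite_data = data          # color nodes by concentration
        builder.save_html('map_t%d.html' % i)   # one frame per timepoint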


VisBOL enhancements for SBOL 2
Arezoo Sadeghi, Dany Fu, James Alastair McLaughlin, Zach Zundel, Anil Wipat, and Chris J. Myers

The initial release of VisBOL enabled synthetic biologists to visually present DNA constructs in the SBOL Visual format. Recent enhancements to this web-based application have expanded its visualization capability to relay more complex information that occurs on and off the DNA strand. One of the added features is the ability to detect a "composite" DNA entity, a part that is made up of sub-designs. We have also added rendering of non-DNA entities such as proteins, RNAs, and small molecules. In doing so, VisBOL can now also show the user common biological interactions such as inhibition, production, degradation, activation and non-covalent binding. These enhancements allow synthetic biologists not only to visualize their DNA constructs, but also to show functional relationships between the non-DNA components and the genetic circuits.

The Tellurium Notebook Environment for Authoring Reproducible Models and Simulations
J Kyle Medley

Working with COMBINE standards such as SBML and SED-ML typically requires considerable technical knowledge of the standards and their encodings. Here, I present a flexible notebook platform for working with SBML, SED-ML and COMBINE archives. Tellurium automatically translates these standards into a human-readable format which is displayed in specialized notebook cells. The environment brings the advantages of Jupyter to community standards in systems biology and serves as a useful tool for modeling and teaching.
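
For example, Tellurium can translate between SBML and the human-readable Antimony syntax that the notebook cells display (the model below is a toy example):

    import tellurium as te

    # A toy model written directly in human-readable Antimony syntax.
    r = te.loada('''
    model simple
        S1 -> S2; k1*S1
        k1 = 0.1; S1 = 10
    end
    ''')

    sbml = r.getCurrentSBML()       # the standard, machine-oriented encoding
    print(te.sbmlToAntimony(sbml))  # ...and back to the human-readable form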

A Tool to Link and Establish Confidence in Evidence Used to Construct SBML Models
Kieran Alden, Paul Andrews, Becky Naylor, Andy Turner, Mark Coles, and Jon Timmis

Modelling is an increasingly used technique across the health and life sciences and the pharmaceutical, agrochemical, and personal care sectors, used to explore mechanisms of action within a system and to offer reduced R&D costs. However, model design decisions, such as assumptions and simplifications, can often be opaque to all but the model's author, and should be justified and supported by the evidence used to inform that decision process. For consistent annotation and curation, authors of SBML models are encouraged to incorporate annotations that utilise unique and persistent links to resources in the Minimum Information Required in the Annotation of Models (MIRIAM) Registry. To further facilitate model communication, reusability, and curation, we have developed a new online tool that permits the location and attachment of MIRIAM identifiers and additional evidence sources to SBML model components. This tool exploits the web services provided by Identifiers.org to permit linking to entries in all current and future data collections. Each attachment, which may include data sets, parameter calculations, images, and justification text, has an accompanying value that indicates author confidence in that evidence. This process produces MIRIAM-compliant models with accompanying overviews of evidence coverage and calculated summary statistics that assist in determining overall model confidence and areas where design justification could be deficient. We propose that our tool offers benefits in increasing model quality, auditability, and confidence in wider application, while reducing model maintenance effort and ensuring retention of IP within institutions. As an exemplar, we show application of the tool in attaching and understanding coverage of evidence for an SBML model of Quorum Sensing downloaded from BioModels.
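
For context, a MIRIAM-style annotation attaches a controlled identifiers.org link to a model component, as in this libSBML sketch (the file name, species id, and UniProt accession are placeholders):

    import libsbml

    doc = libsbml.readSBMLFromFile('model.xml')   # placeholder model file
    species = doc.getModel().getSpecies('s1')     # placeholder species id
    species.setMetaId('meta_s1')                  # required before adding CVTerms

    cv = libsbml.CVTerm(libsbml.BIOLOGICAL_QUALIFIER)
    cv.setBiologicalQualifierType(libsbml.BQB_IS)
    cv.addResource('http://identifiers.org/uniprot/P12345')  # placeholder accession
    species.addCVTerm(cv)

    libsbml.writeSBMLToFile(doc, 'model_annotated.xml')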

Sub-SBML: A Subsystem Interaction Modeling Toolbox for SBML Models
Ayush Pandey and Richard Murray

We present Sub-SBML, a Python-based toolbox to create, edit, combine, and model interactions among multiple SBML models. Sub-SBML works with a “subsystem” architecture of modeling in which a single SBML model is contained within a subsystem. Three major functionalities are developed in Sub-SBML that take advantage of this subsystem framework – creating subsystems, combining multiple subsystems, and modeling interactions such as the transport of molecules and input-output relationships among multiple subsystems.

Sub-SBML provides functions to create subsystems for SBML models and various utility functions to edit them (such as renaming model components, creating multiple components at once, integrating different simulator options, and simulating variable species amounts in a model). When modeling composite biomolecular systems, it is often desirable to combine multiple smaller models of different modules. The primary features of Sub-SBML allow for modeling of such systems by creating subsystems for the different models and then combining these subsystems as desired, while also modeling the interactions among the models. A list of subsystems may be combined by specifying a list of species to be shared in the final combined model. The toolbox also provides functions for the case where all species that have the same name should become a single combined species in the final model.
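
A usage sketch of this combining step might look as follows (function and method names here are hypothetical, inferred from the description above rather than verified against the released package):

    # Hypothetical API; names are illustrative only.
    from subsbml import createSubsystem

    ss1 = createSubsystem('transcription_module.xml')  # placeholder file names
    ss2 = createSubsystem('reporter_module.xml')

    # Combine the two subsystems, merging every species that shares a name.
    combined = ss1.combineSubsystems([ss2], combineNames=True)
    combined.writeSBML('combined_model.xml')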

Sub-SBML can also use SBML compartments to model the transport of species between different models. A subsystem object in Sub-SBML consists of an SBML model with one compartment holding all the species of that model. Multiple subsystems may be placed inside a “system”. A system object in Sub-SBML acts as a container for subsystems. Subsystems may be defined as internal to a system, external to a system, or as membranes of a system. Hence, the SBML model of a system consists of two compartments – internal and external. A membrane subsystem also has these two compartments, plus reactions modeling the transport of species across it. A system model can then be generated that combines all subsystems that are internal, external, or set as the membrane of the system, hence modeling transport across the system container. Similarly, multiple system objects can be created, and interactions across multiple systems may be modeled as well.
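
Continuing the hypothetical sketch above for the system container (again, class and method names are illustrative, inferred from this description rather than verified against the released package):

    # Hypothetical API, continuing the sketch above; names are illustrative.
    from subsbml import System, createSubsystem

    cell = System('cell')                                    # the container
    cell.setInternal([createSubsystem('expression.xml')])    # inside the vesicle
    cell.setExternal([createSubsystem('environment.xml')])   # the surroundings
    cell.setMembrane(createSubsystem('transport.xml'))       # cross-membrane reactions

    # Combined SBML model with 'internal' and 'external' compartments.
    model = cell.getModel()
    model.writeSBML('cell_system.xml')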

We consider an example of IPTG transport across a synthetic vesicle when α-hemolysin is expressed inside the vesicle, creating pores in the membrane that allow IPTG molecules to pass from the environment into the vesicle. We demonstrate the use of Sub-SBML to model this system in SBML. We also consider other synthetic biological circuits to demonstrate input-output species interactions among subsystems and to obtain combined SBML models.


A Web-based CAD tool for the Synthetic Biology Open Language
James Alastair McLaughlin, Irina Dana Ofiţeru, and Anil Wipat

Since the release of version 2.0 in 2015, the Synthetic Biology Open Language (SBOL) has been able to represent not just DNA sequences and their features, but also more complex engineered biological systems comprising proteins, RNAs, and biochemical interactions. However, tooling support for authoring many features of SBOL2 is still limited. For example, it is still not possible to create SBOL2 documents containing modules and interactions without manually writing code. This lack of tooling makes SBOL2 overly complicated to use and, consequently, inaccessible to many potential users.

We have developed ngBioCAD, a computer-aided design (CAD) tool for synthetic biology built on the SBOL2 and SBOL Visual standards. ngBioCAD is entirely Web-based, and uses an intuitive “drag and drop” approach to design common to CAD tools outside of synthetic biology. Features supported by ngBioCAD include hierarchical top-down design, specification of interactions, and sequence editing. The authors hope that ngBioCAD will both promote the adoption of the SBOL2 standard and serve as a useful tool for the design stage of the synthetic biology lifecycle.


Revealing the mode of action of Ras-Raf inhibitors by molecular dynamics simulations
Darex Vera Rodríguez and Carlos Camacho

Ras/Raf, a protein complex implicated in more than 30% of all human cancers, initiates signaling that can contribute to the development of tumors. Several Monocyte Chemoattractant Proteins (MCPs) have been found to inhibit this complex by binding to Ras and thus deactivating Raf gene expression. This work focuses on determining the binding site of the Ras inhibitors using molecular docking and molecular dynamics (MD) simulations. MCP compounds were docked to the Ras crystal structure available in the Protein Data Bank (PDB ID: 4G0N) using SMINA. MD simulations of the predicted poses of the MCP compounds bound to Ras were performed using AMBER. The results indicate that the C-terminus complexes with the Ras inhibitors via several hydrophobic interactions and hydrogen bonds. These results can guide the design of more potent Ras inhibitors, which could help prevent the development of human cancer tumors.

Representation of ME-Models in SBML
Marc Alexander Voigt, Colton J. Lloyd, Laurence Yang, Zachary A. King, Oliver Kohlbacher, Kay Nieselt, and Andreas Dräger

Metabolism and Expression models (ME-models) [1] are a constraint-based modeling approach that explicitly accounts for the cost of macromolecular biosynthesis. This allows ME-models to investigate genotype-phenotype relationships with quantitative incorporation of '-omics' data. Common standards are a prerequisite for the interoperability of systems biology tools, but novel approaches often require additional or changed data structures that cannot be represented in existing standards. This has been the case for ME-models: they are powerful tools, but a lack of standards for their encoding and creation, together with the increased complexity of these models, has prevented widespread use.

SBMLme is an extension of current model encoding standards that enables SBML representations of ME-models reconstructed using COBRAme [2]. A prototype of the extension has been created in Java, together with a standalone, bi-directional converter between this extension and COBRAme's model storage format. The converter showed that SBMLme can fully and correctly encode a COBRAme model.

The use of SBMLme enables ME-models to be shared more efficiently and a wider variety of tools to access them, which should promote the propagation of ME-models. SBMLme may serve as a proof of concept for an official SBML package for ME-models. The standardization process requires ongoing discussion, the identification of a consensus SBML draft standard, and the implementation of the existing validation rules.

SBMLme is freely available at https://github.com/draeger-lab/SBMLme under the terms of the MIT license.

References:

[1] Ines Thiele, Ronan M. T. Fleming, Richard Que, Aarash Bordbar, Dinh Diep, and Bernhard O. Palsson. Multiscale Modeling of Metabolism and Macromolecular Synthesis in E. coli and Its Application to the Evolution of Codon Usage. PLOS ONE, 7(9):1–18, September 2012.

[2] Colton J Lloyd, Ali Ebrahim, Laurence Yang, Zachary Andrew King, Edward Catoiu, Edward J O'Brien, Joanne K Liu, and Bernhard O Palsson. COBRAme: A Computational Framework for Models of Metabolism and Gene Expression. bioRxiv, 2017.


Quantifying Promoter Element Activation in Marchantia polymorpha Gemmae
Jeanet Mante, Mihails Delmans, and Jim Haseloff

Marchantia polymorpha is increasingly used as a model organism for synthetic biology. Here, we present an image library containing images of nuclear-localised fluorescent proteins (proMpARF3:Venus and proMpYuC2:mTurquoise) and chlorophyll autofluorescence images for 153 gemmae. These images were also used to test a computational tool for quantitatively describing the distributions of fluorescently labelled nuclei in M. polymorpha gemmae regardless of shape variations. The tool was used to show that the spatial distributions of fluorescently labelled nuclei were significantly different between proMpARF3:Venus and proMpYuC2:mTurquoise, validating the use of the tool.

Model-Based Prediction of Yersinia enterocolitica Infection Outcome
Janina Geißert, Martin Eichner, Erwin Bohn, Reihaneh Mostolizadeh, Andreas Dräger, Ingo B. Autenrieth, Sina Beier, and Monika Schütz

The course and outcome of gastrointestinal infections are determined by the complex interplay of a given pathogen, its virulence and fitness factors, the host immune response and the presence and composition of the endogenous microbiome. An expansion of pathogens within the gastrointestinal tract implies an increased risk for the development of severe systemic infections, especially in patients receiving antibiotic treatment or in an immunocompromised state.

We aimed to gain a deeper understanding of the complex relationships between pathogen, host, and microbiome. To predict pathogen expansion, gut colonization and infection outcome, we employed a powerful tool of systems biology: the development of a computational model. For implementation and challenge of the model, oral mouse infection experiments with the enteropathogen Yersinia enterocolitica (Ye) were used. Our model calculates the bacterial population dynamics during gastrointestinal infection and accounts for specific pathogen characteristics, the host immune capacity, and the colonization resistance mediated by the endogenous microbiome. First, we performed model parameter optimization based on the experimental data we obtained by infection of a healthy host. Afterward, we challenged our model by adopting scenarios in which either the microbiome was lacking (mimicking antibiotic treatment of patients) or the immune response was partially impaired. The Ye population dynamics predicted for these scenarios could be confirmed in experimental mouse infections.
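
The abstract does not give the model equations; purely as an illustration of this kind of population-dynamics model, a minimal sketch with logistic pathogen growth, immune clearance, and microbiome-mediated colonization resistance could read as follows (this is not the authors' model, and all parameters are arbitrary):

    import numpy as np
    from scipy.integrate import odeint

    # Illustrative only -- not the published model; parameters are arbitrary.
    r, K = 1.2, 1e9      # pathogen growth rate and carrying capacity
    k_imm = 0.4          # clearance by the host immune response
    k_res = 0.6          # colonization resistance from the microbiome

    def dynamics(y, t, immune=1.0, microbiome=1.0):
        ye = y[0]  # Yersinia population size
        growth = r * ye * (1 - ye / K)
        clearance = (k_imm * immune + k_res * microbiome) * ye
        return [growth - clearance]

    t = np.linspace(0, 10, 200)
    healthy = odeint(dynamics, [1e4], t)                         # intact host
    no_microbiome = odeint(dynamics, [1e4], t, args=(1.0, 0.0))  # antibiotic scenario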

Our model provides new hypotheses about the roles of host- and pathogen-derived factors within this complex interplay and might be useful for future development of personalized infection prevention and treatment strategies.


FAIRDOMHub: a repository and collaboration environment for sharing systems biology research
Olga Krebs, Martin Golebiewski, Ulrike Wittig, Alan Williams, Natalie Stanford, Stuart Owen, Finn Bacall, Katy Wolstencroft, Jacky Snoep, Wolfgang Müller, and Carole Goble

The FAIRDOMHub (https://fairdomhub.org/) is a web-accessible repository for publishing FAIR (Findable, Accessible, Interoperable and Reusable) Data, Operating procedures and Models for the Systems Biology community. It enables researchers to organize, share and publish data, models and protocols, interlink them in the context of the systems biology investigations that produced them, and to interrogate them via API interfaces.

By using the FAIRDOMHub, researchers can achieve more effective exchange with geographically distributed collaborators during projects, ensure results are sustained and preserved and generate reproducible publications that adhere to the FAIR guiding principles of data stewardship.
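
FAIRDOMHub is built on the SEEK platform; a minimal sketch of querying its JSON API could look like this (the endpoint pattern and JSON:API media type are assumptions based on SEEK conventions, and the record id is arbitrary):

    import requests

    # Fetch one model record via the JSON API (record id is arbitrary).
    resp = requests.get('https://fairdomhub.org/models/1',
                        headers={'Accept': 'application/vnd.api+json'})
    resp.raise_for_status()
    record = resp.json()['data']
    print(record['attributes']['title'])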


Sharing COMBINE Archives using SynBioHub
Zach Zundel and Chris Myers

The COMBINE archive is a well-defined standard for sharing related data encoded using multiple standards. Most commonly, these standards are those supported by COMBINE, but arbitrary data can be included in such an archive. SynBioHub is a data repository under active development based on the SBOL (Synthetic Biology Open Language) standard. SBOL is a standard designed to encode the structure and function of genetic constructs. Though SynBioHub is designed to primarily support SBOL, it can store any linked data in its RDF triplestore, as well as other data as uploaded files attached to an SBOL object.

Functionality to upload and download COMBINE archives to and from SynBioHub has been implemented. SBOL files in the archive are inserted into a general-purpose SBOL Collection created for the archive. Other files are uploaded to the filestore, and linked to the SBOL Collection using the new SBOL Attachment TopLevel object. This functionality was validated using the iBioSim GDA tool and libSBOLj's SynBioHubFrontend.
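
For programmatic use, the upload path can be sketched against SynBioHub's HTTP API (a hedged sketch: the /login and /submit routes and field names follow the SynBioHub documentation as we understand it, and credentials and identifiers are placeholders):

    import requests

    base = 'https://synbiohub.org'

    # Log in to obtain a user token (credentials are placeholders).
    token = requests.post(base + '/login',
                          data={'email': 'user@example.org', 'password': '...'},
                          headers={'Accept': 'text/plain'}).text

    # Submit a COMBINE archive into a new collection.
    with open('experiment.omex', 'rb') as f:
        requests.post(base + '/submit',
                      headers={'X-authorization': token, 'Accept': 'text/plain'},
                      data={'id': 'myArchive', 'version': '1',
                            'name': 'My archive', 'description': 'Example',
                            'overwrite_merge': '0'},
                      files={'file': f})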


SBOLExplorer: Data Infrastructure and Data Mining for Genetic Design Repositories
Michael Zhang

Biology is a very noisy field. Experiments are difficult to reproduce, the mechanisms behind life are not well understood, and the data that we do obtain is difficult to make sense of. Much like traditional engineering fields where engineers draw from a library of reusable parts for their designs, experimental and synthetic biologists have designed biological circuits by drawing from a library of genetic constructs. However, these so-called genetic parts are poorly understood and are therefore limited in their usefulness. Additionally, there are hundreds of thousands of parts and sequences that have been either created or discovered. In this presentation, I will show how to filter through this biological noise to provide genetic circuit designers with a powerful way to search for and access the genetic parts that are useful to them.

This presentation will focus on SBOLExplorer, a system that is used to provide intuitive search within the SynBioHub genetic part repository. SynBioHub integrates genetic construct data from various sources and transforms and stores this data in the Synthetic Biology Open Language (SBOL), a standardized data model. By tackling the intricate data mining and data infrastructure problems associated with large scale unstructured and noisy data, the search, transformation, and storage of SynBioHub data can be enhanced. In particular, this presentation will focus on improving the usability of SynBioHub's search capabilities. By clustering SynBioHub's genetic parts into many derived collections, duplicate parts can be merged and diverse results can be shown. From there, a graph analysis algorithm can be designed to rank collections of parts by popularity and usefulness. Finally, data infrastructure challenges relating to indexing, storing, and serving need to be solved. The end goal is to integrate these findings into SynBioHub's data representation, search functionality, and data infrastructure.
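
As an illustration of the ranking step, a PageRank-style analysis over a part-usage graph can be prototyped with networkx (the toy graph below is a stand-in for SynBioHub's actual usage links):

    import networkx as nx

    # Toy usage graph: an edge A -> B means design A uses part collection B.
    g = nx.DiGraph()
    g.add_edges_from([
        ('design1', 'pTet_parts'), ('design2', 'pTet_parts'),
        ('design2', 'gfp_parts'),  ('design3', 'gfp_parts'),
        ('design3', 'pTet_parts'),
    ])

    # Collections used by many designs accumulate rank.
    scores = nx.pagerank(g, alpha=0.85)
    for node, score in sorted(scores.items(), key=lambda kv: -kv[1]):
        print(node, round(score, 3))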


Software Tools for Next-Gen Genetic Design
Nicholas Roehner, Lucy Qin, Dany Fu, and Douglas Densmore

In order to achieve full exploration of novel combinations of genetic parts, researchers need a mechanism to automate the process of creating, manipulating, and storing genetic designs. Our solution is an ecosystem of tools that express design spaces as graphs in a standardized, efficient, and retrievable manner. This ecosystem includes GOLDBAR, a combinatorial language for expressing user-defined biological design patterns; Constellation, a user interface for manipulating and enumerating designs; and Knox, which serves as a database of designs as well as a tool for large-scale graph manipulation, including operators such as merging and intersecting designs. We have designed these as modular tools that have also been integrated to work together, utilizing community standards such as the Synthetic Biology Open Language (SBOL), which allows for ease of integration with other tools in the synthetic biology community.
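
To give a flavor of such a combinatorial language, a GOLDBAR-style pattern might read as follows (shown as a Python string; the operator names are illustrative and should be checked against the GOLDBAR documentation):

    # Illustrative only: a GOLDBAR-style design pattern as a string;
    # the concrete operator syntax should be checked against GOLDBAR's docs.
    design_space = "one-or-more(promoter then rbs then cds) then terminator"

    # Constellation enumerates concrete designs from such a pattern, and
    # Knox stores the resulting design-space graphs, supporting operators
    # such as merge and intersect over whole spaces.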

The Center for Reproducible Biomedical Modeling
Herbert Sauro, Michael Blinov, Dan Cook, John Gennari, Arthur Goldberg, Jonathan Karr, Ion Moraru, David Nickerson, and James Schaff

The Center for Reproducible Biomedical Modeling is a new NIH-funded center that aims to enable comprehensive predictive models of biological systems, such as whole-cell models, that can guide medicine and bioengineering. Achieving this goal requires new tools, resources, and best practices for systematically, scalably, and collaboratively building, simulating and applying models, as well as new researchers trained in comprehensive modeling. To meet these needs, the center will develop new technologies for comprehensive modeling, work with journals to provide authors, reviewers, and editors with model annotation and validation services, and organize courses and meetings to train researchers to model systematically, scalably, and collaboratively.

FAIR data exchange in the life sciences by standardization of heterogeneous data and multicellular models
Martin Golebiewski, Lutz Brusch, Haralampos Hatzikirou, and Wolfgang Müller

Given the increasing flood and complexity of data in the life sciences, standardization of these data and their documentation is crucial. This comprises the description of methods, biological material and workflows for data processing, analysis, exchange and integration (e.g., into computational models), as well as the setup, handling and simulation of models. Hence, standards for formatting and describing data, workflows and computer models have become important, especially for data integration across the biological scales in multiscale approaches.

To this end, many grassroots standards for data, models and their metadata have been defined by the scientific communities, driven by standardization initiatives such as the Computational Modeling in Biology Network (COMBINE). To fill gaps in domains that currently lack standardization, such as the field of multicellular and multiscale modelling, we are defining, together with our partners, new modelling standards such as MultiCellML, a new exchange format for multicellular models.

Moreover, for the integration of data and models, standards have to be harmonized to be interoperable and to allow interfacing between datasets. We drive and lead the definition of novel standards of the International Organization for Standardization (ISO) in the technical ISO committee for biotechnology standards (ISO/TC 276) in order to define a framework and guidelines for community standards and their application. With our activities, we aim to enhance the interoperability of community standards for life science data and models and thereby to facilitate complex and multiscale data integration and model building with heterogeneous data gathered across the domains. This is accompanied by activities that build a bridge between stakeholders and develop the means and channels for transferring information about standards between them, such as the NormSys registry for modelling standards (http://normsys.h-its.org).


Harmonizing semantic annotations on computational models in biology
Neal ML, König M, Nickerson D, Mısırlı G, Kalbasi R, Dräger A, Atalag K, Chelliah V, Cooling M, Cook DL, Crook S, de Alba M, Friedman SH, Garny A, Gennari JH, Gleeson P, Golebiewski M, Hucka M, Juty N, Le Novère N, Myers C, Olivier BG, Sauro HM, Scharm M, Snoep JL, Touré V, Wipat A, Wolkenhauer O, and Waltemath D

Life science researchers use computational models to articulate and test hypotheses about the behavior of biological systems. Semantic annotation is a critical component for enhancing the interoperability and reusability of such models as well as for the integration of the data needed for model parameterization and validation. Encoded as machine-readable links to knowledge resource terms, semantic annotations describe the computational or biological meaning of what models and data represent. These annotations help researchers find and repurpose models, accelerate model composition, and enable knowledge integration across model repositories and experimental data stores. However, realizing the potential benefits of semantic annotation requires the development of model annotation standards that adhere to a community-based annotation protocol. Without such standards, tool developers must account for a variety of annotation formats and approaches, a situation that can become prohibitively cumbersome and which can defeat the purpose of linking model elements to controlled knowledge resource terms. Currently, no consensus protocol for semantic annotation exists among the larger biological modeling community. Here, we report on the landscape of current annotation practices among the COmputational Modeling in BIology NEtwork (COMBINE) community and provide a set of recommendations for building a consensus approach to semantic annotation.
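
As a concrete illustration, a semantic annotation is essentially a machine-readable triple linking a model element to a knowledge-resource term, as sketched here with rdflib (the element URI and ChEBI term are arbitrary examples):

    from rdflib import Graph, Namespace, URIRef

    BQBIOL = Namespace('http://biomodels.net/biology-qualifiers/')

    g = Graph()
    # "The model species 'glucose' is the chemical entity CHEBI:17234."
    g.add((URIRef('http://example.org/model#glucose'),
           BQBIOL['is'],
           URIRef('https://identifiers.org/CHEBI:17234')))

    print(g.serialize(format='turtle'))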

Availability of Code from Published Computational Physiology Models
Graham Kim and John Gennari

Introduction: Public model repositories, such as the Physiome Model Repository (PMR) [1] and the BioModels Database [2], are vital resources for accessing computational models of biological processes. There is broad recognition that these resources provide value for scientists aiming to build on others' work. Both repositories are manually curated: curators identify a publication with a model and then work to develop the code and establish its reproducibility. These methods do not scale well with the pace of publication. Without these third-party curators, what percentage of models described in the literature are reproducible? More fundamentally, we investigate what percentage of publications about models include some access to, or information about, the model code.

Methods: In order to elucidate the status of model availability and reproducibility in the literature, we conducted a scoping review to characterize computational physiology models. We looked at whether or not (a) the model code is available, (b) the modeling language used is stated, and (c) the equations and parameters used in the model are listed. We examined three categories of model publications, going from broad to narrow. First, using a combination of MeSH terms, we searched PubMed for computational physiology models broadly—this resulted in over 6,500 publications, of which only a fraction was relevant. Next, we searched more specifically for cardiovascular models in PubMed, which returned over 1,000 publications. Finally, we examined diabetes model publications identified as Clinical Models in a review article [3]—a list of 96 models. From each of the three categories, we randomly sampled and screened publications for inclusion, and analyzed the content of 50 full-text publications per category for model code availability.

Results: All but one of the model publications examined had no model code available (one publication referred to an external link that is no longer available). Furthermore, most model publications in the general and cardiovascular categories listed only a subset of the equations and parameters used for model simulation. More than a third of the model publications in the diabetes category listed only a subset of the model equations and parameters. In this third category, about 7% of the models were included in the BioModels library, but all of these were added and curated after publication. Thus, even for publications with curated models, a scientist simply reviewing the literature would have no easy way of finding the model code.

Conclusion: Despite the push towards reproducibility of computational models, the vast majority of model publications do not provide sufficient information to reproduce the model simulations they describe. At a minimum, modelers and authors should indicate whether source code is available, along with some information about the language used. Better still, modelers should include sufficient information in their publications so that the models can be reproduced and simulated by others; ideally, this would include submitting their code to repositories such as the BioModels Database or the PMR. Moving forward, one approach to ameliorate this situation would be for journals and publishers to require, or at least encourage, authors to submit model source code and all of the equations and simulation parameters as part of the supplemental material for the publication.


1. Yu T, Lloyd CM, Nickerson DP, Cooling MT, Miller AK, Garny A, et al. The Physiome Model Repository 2. Bioinformatics. 2011 Mar 1;27(5):743–4.

2. Li C, Donizelli M, Rodriguez N, Dharuri H, Endler L, Chelliah V, et al. BioModels Database: An enhanced, curated and annotated resource for published quantitative kinetic models. BMC Syst Biol. 2010 Jun;4:92–92.

3. Ajmera I, Swat M, Laibe C, Le Novère N, Chelliah V. The impact of mathematical modeling on the understanding of diabetes and related complications. CPT Pharmacomet Syst Pharmacol. 2013 Jul 10;2:e54.


Modeling of Potentially Virulence-Associated Metabolic Pathways in Pseudomonas aeruginosa PA14 Including Experimental Verification
Alina Renz, Erwin Bohn, Monika Schütz, and Andreas Dräger

According to the 'Antimicrobial resistance surveillance in Europe (2015)' report, P. aeruginosa is an opportunistic human pathogen that causes many infections in hospitalized patients with immune defects or impairments. Since it is difficult to control P. aeruginosa in hospitals, it can cause hospital-acquired pneumonia [1]. Some strains of P. aeruginosa seem to be associated with higher mortality than others. With the help of a published genome-scale model of PA14 [2] and sequencing data of a highly pathogenic patient strain that was recently isolated in Tübingen, metabolic differences between the laboratory and patient strains are to be identified and subsequently verified in laboratory experiments.

First, differences between the sequences of the two strains were identified by SNP analysis. These differences were then used to find metabolic alterations affecting virulence. Using FBA and in-silico gene knock-outs, three genes were identified that could be responsible for the difference in pathogenicity between the laboratory strain and the highly pathogenic patient strain. These genes affect metabolic reactions that are associated with virulence in P. aeruginosa. Knock-outs of these three genes are currently being performed in laboratory experiments to verify their metabolic relevance to virulence.
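
The in-silico knock-out screen described here is the kind of analysis supported directly by COBRApy, sketched below (the model file name is a placeholder for the published PA14 reconstruction):

    from cobra.io import read_sbml_model
    from cobra.flux_analysis import single_gene_deletion

    model = read_sbml_model('PA14_model.xml')  # placeholder file name

    # FBA growth rate of the unperturbed strain...
    wild_type = model.optimize().objective_value

    # ...and predicted growth after each single-gene knock-out.
    knockouts = single_gene_deletion(model)
    impaired = knockouts[knockouts['growth'] < 0.5 * wild_type]
    print(impaired)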

However, only a fraction of the genes of P. aeruginosa is included in the published model. Many additional genes are associated with virulence, differ between the sequenced laboratory and patient strains, or both. Therefore, the model is being extended to increase its predictive value.


References:

[1] European Centre for Disease Prevention and Control. Antimicrobial resistance surveillance in Europe 2015. Annual Report of the European Antimicrobial Resistance Surveillance Network (EARS-Net). Stockholm: ECDC; 2017. https://ecdc.europa.eu/en/publications-data/antimicrobial-resistance-surveillance-europe-2015#no-link

[2] Bartell JA, Blazier AS, Yen P, Thøgersen JC, Jelsbak L, Goldberg JB, and Papin JA. Reconstruction of the metabolic network of Pseudomonas aeruginosa to interrogate virulence factor synthesis. Nature Communications (2017). doi:10.1038/ncomms14631


Semantics-based model discovery and assembly for renal transport
Dewan Sarwar, Reza Kalbasi, Koray Atalag, and David Nickerson

Biophysically-based computational models have great potential to make significant contributions to many clinical applications. To help realise this potential, we have been developing tools to help scientists and clinicians discover existing models that are relevant to their needs and to then comprehend the capabilities of such models prior to their assembly into novel model-driven projects. We have comprehensively annotated an initial cohort of renal epithelial transport models with biological semantics, including knowledge such as protein identifiers, anatomical locations, and solutes transported. These annotations are deposited in the Physiome Model Repository (PMR) to ensure they are accessible, persistent, discoverable, and resolvable. We have developed a web-based tool which enables users to discover models relevant to the questions and hypotheses they are investigating and then semantically assemble models. In addition to model discovery and assembly, this tool provides visualisation of the biological semantics and guided graphical editing of a model description. Model editing is aided by a recommender system which guides users to relevant models in the repository, using standard bioinformatics services to help rank recommended models.

The semantic annotation and modelling platform we have developed is a new contribution enabling scientists and clinicians to discover relevant models in the PMR and reuse them in other computational modelling initiatives and translational projects linking to real-world health data. We believe that this approach demonstrates how semantic web technologies and methodologies can contribute to biomedical and clinical research. Furthermore, novice modellers could use this platform as a learning tool. The source code is available on GitHub: https://github.com/dewancse/epithelial-modelling-platform


Metabolic Network Reconstruction of Treponema pallidum ssp. pallidum
Silvia Morini, Isabella Casini, Thomas M. Hamm, Kay Nieselt, and Andreas Dräger

Background: Since the discovery in 1905 by Schaudinn and Hoffmann of Treponema pallidum ssp. pallidum as the etiologic agent of syphilis, medicine has made significant progress against this disease. Yet, despite the availability of diagnostic tests and antibiotic therapy, syphilis continues to burden the world: it has been re-emerging globally over the last few decades (WHO Report, 2008), no vaccine is yet available, and in its early stages it enhances the transmission of HIV. Continuous in vitro culture of this organism has still not been achieved, imposing a substantial roadblock to its experimental investigation, and even the sequencing of its genome (Fraser et al., 1998) did not yield an obvious solution to the cultivation problem. While much has already been tried on the laboratory bench, this pathogen has still not (to our knowledge) been tackled using a systems biology approach.

Results: Here, we present a first manually curated draft reconstruction of the metabolic network of Treponema pallidum ssp. pallidum towards a genome-scale metabolic model (GEM). At this time, the model iSM161 comprises 161 genes of 1,039 predicted open reading frames, responsible for 239 reactions involving 277 metabolites. The model is still under development and is steadily updated. For the reconstruction, COBRApy has been used, with subsystem information added and encoded via the SBML Groups extension using libSBML.
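
A sketch of the subsystem bookkeeping described, using the SBML Groups plugin through libSBML (group and reaction identifiers are placeholders):

    import libsbml

    doc = libsbml.readSBMLFromFile('iSM161.xml')  # placeholder file name
    doc.enablePackage(libsbml.GroupsExtension.getXmlnsL3V1V1(), 'groups', True)
    doc.setPackageRequired('groups', False)
    mplugin = doc.getModel().getPlugin('groups')

    # One group per subsystem; members reference reaction ids (placeholders).
    group = mplugin.createGroup()
    group.setId('glycolysis')
    group.setKind(libsbml.GROUP_KIND_PARTONOMY)
    group.setName('Glycolysis')
    member = group.createMember()
    member.setIdRef('R_PFK')  # placeholder reaction id

    libsbml.writeSBMLToFile(doc, 'iSM161_groups.xml')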

Discussion: Using this reconstruction, together with the application of COBRA methods, we anticipate gaining new insights into the pathogen’s physiology and pathology, and into how this spirochete, which has earned the designation of “stealth pathogen,” succeeds in making a living and eluding human immune defenses as well as cultivation attempts. We plan to make the model available to the community in SBML format.


cyTRON/JS: A Cytoscape.js-based application for the inference of cancer progression models
Lucrezia Patruno, Marco Antoniotti, and Alex Graudenzi

With the advent of next-generation sequencing technologies, the need arose for new algorithmic methods to study the great amount of data produced, and this has had a great impact on the study of cancer progression.

TRONCO is an R package for the inference of the evolutionary history of cancer mutations, developed by the BIMIB group at the University of Milan-Bicocca. It includes functions that guide the user through the process of creating an analysis starting from mutation data. The work presented here concerns the development of cyTRON/JS, a web application that serves as a bridge between JavaScript and TRONCO's R functions. It was conceived for two main purposes:

- Providing an interactive visualization of TRONCO models: while TRONCO's graph display is static and cannot be modified, cyTRON/JS provides an interactive view of graphs, making it possible to directly retrieve information about the genes involved in the study, by accessing widely used public genome databases, and about the algorithms that led to the model reconstruction. The visualization is based on Cytoscape.js.

- Making TRONCO accessible to users not familiar with R programming. The application is made up of two main modules: one for data input and parameter setting, and the other for model reconstruction, which is completely transparent to users. The communication between the modules is based on standard data exchange protocols and data formats.

cyTRON/JS was developed within the Google Summer of Code 2018 program, in collaboration with the National Resource for Network Biology (NRNB) organization.


A Tool to Link and Establish Confidence in Evidence Used to Construct Computer Simulations
Kieran Alden, Paul Andrews, Becky Naylor, Andy Turner, Mark Coles, and Jon Timmis

Modelling is increasingly used to explore mechanisms of action within a system and to offer reduced R&D costs. However, model design decisions, such as assumptions and simplifications, can often be opaque to all but the model's author, and should be justified and supported by the evidence used to inform that decision process.

Our online platform permits the attachment of sources of evidence, including data sets, statistical calculations, images, and text, to model file components. Each attachment is accompanied by a value indicating confidence in that evidence, retaining the reasoning for model composition while identifying potential areas of deficiency.

Our tool is compatible with a range of model specification formats, including MATLAB, XML, and SBML. For SBML models, our platform permits the location and attachment of resources in the Minimum Information Required in the Annotation of Models (MIRIAM) Registry, using services provided by Identifiers.org.

Overviews of evidence coverage and summary statistics are produced to assist in determining overall model confidence. Platform adoption offers benefit in increasing model quality, auditability, and confidence in wider model application, while reducing model maintenance effort and ensuring retention of IP within institutions.


Python Based Standardization Tools for ClinicalTrials.Gov
Jacob Barhak

ClinicalTrials.Gov is a government database that stores clinical trials from around the world. This database is growing fast, partially because some requirements for reporting clinical trial results are now backed by U.S. law. Despite the growth of the database, the data stored there is still not widely used for modeling and simulation, partially because the database is still a relatively new tool, and partially because the data is not standardized. Since data is entered by multiple entities into this semi-structured, text-based database, and since clinical trials vary widely, the data is not immediately suitable for modeling. Although the National Library of Medicine scrutinizes this data, the scrutiny aims at human-comprehensible data, while modeling requires computer comprehension.

For this reason, there is a need to clean the data for tasks such as disease modeling. This work will discuss a set of Python tools that were used to process ClinicalTrials.Gov with the aim of standardization. The tools parse the XML data, index the information for easier processing, and then use machine learning and natural language processing to cluster the data. The clustering is used to assist a human user in reviewing similar information through a graphical user interface.
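
A hedged sketch of such a parse-then-cluster pipeline with the standard library and scikit-learn (the XPath and file locations are placeholders for the actual ClinicalTrials.Gov schema):

    import glob
    import xml.etree.ElementTree as ET
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.cluster import KMeans

    # Extract one text unit per trial record (XPath is a placeholder).
    texts = []
    for path in glob.glob('trials/*.xml'):
        measure = ET.parse(path).getroot().findtext('.//primary_outcome/measure')
        if measure:
            texts.append(measure)

    # Vectorize with TF-IDF and cluster similar outcome descriptions.
    X = TfidfVectorizer(stop_words='english').fit_transform(texts)
    labels = KMeans(n_clusters=50, n_init=10).fit_predict(X)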

These Python tools were used to extract 21,094 units from 30,763 different clinical trials with results. These quantities show a clear need for standardization before the data can be used in future computer modeling efforts.


SED-ML extension for parameter fitting and uncertainty estimates
Keeyan Ghoreshi, James Greene, Daniel Fryer, Kevin Brown, Ryan Gutenkunst, and Michael Blinov

Research involving detailed dynamical computational models of bio-molecular networks is growing explosively. However, such models are generally highly parameterized, and assigning parameter values is often a limiting step in model development. Direct measurements of reaction rate constants and molecular concentrations are often unavailable or unreliable. Thus, dynamic modeling relies on fitting of model parameters. Furthermore, quantitative predictions can be made meaningful only if they are accompanied by well-founded uncertainty estimates.

Here, we present an XML extension to the Simulation Experiment Description Markup Language (SED-ML) that enables the description of simultaneous fitting of several models and data sets, and of the uncertainty in the fitting. An extended SED-ML file describes several models that are simultaneously fitted to one or more data sets. Several new elements are added to the SED-ML schema: (a) cross-references between models and data; (b) several new tasks (such as optimization and ensemble simulations), with multiple layers within each task, such as a parameters layer under the optimization task, or histograms and subplots under ensemble simulations; (c) saving the state of the system, necessary to avoid extensive repetition of the optimization if only post-processing is required. A simplified version of the extension is implemented as an input for the SloppyCell software (Gutenkunst, R.N., et al. 2007, PLoS Comp Biol) for exploring uncertainties in both model parameters and model predictions.
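
To make the proposal concrete, a hypothetical fragment of such an extended SED-ML file might look as follows (element and attribute names are illustrative only, not taken from a published schema):

    # Hypothetical sketch of the proposed extension elements; tag and
    # attribute names are illustrative only, inferred from the abstract.
    proposed_sedml = """
    <sedML xmlns="http://sed-ml.org/sed-ml/level1/version3">
      <listOfTasks>
        <optimization id="fit1" modelReference="model1" dataReference="data1">
          <listOfFitParameters>
            <fitParameter target="k1" lowerBound="1e-3" upperBound="1e3"/>
          </listOfFitParameters>
        </optimization>
        <ensembleSimulation id="ens1" taskReference="fit1" numberOfSamples="1000"/>
      </listOfTasks>
    </sedML>
    """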


Tellurium 2.0: An Extensible Python-based Modeling Environment for Systems Biology
Kiri Choi, J Kyle Medley, Matthias König, Lucian Smith, Stanley Gu, Joseph Hellerstein, Stuart Sealfon, and Herbert Sauro

Here we present the next iteration of Tellurium, a Python-based environment for model building, simulation, and analysis that facilitates reproducibility of models in systems and synthetic biology. Tellurium is a modular, cross-platform, and open-source simulation environment composed of multiple libraries, plugins, and specialized modules and methods. Two interfaces are provided: one based on the Spyder IDE, which has an accessible user interface akin to MATLAB, and a second based on the Jupyter Notebook, a format that contains live code, equations, visualizations, and narrative text. Tellurium uses libRoadRunner as the default SBML simulation engine, which supports deterministic simulations, stochastic simulations, and steady-state analyses. Tellurium includes languages such as Antimony and phraSED-ML and comes with tools such as a SED-ML/COMBINE archive-to-Python translator to ensure reproducibility and exchangeability. By combining multiple libraries, plugins, and modules into a single package, Tellurium provides a unified but extensible solution for biological modeling and analysis for both novices and experts.
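
A minimal end-to-end example of the Antimony-based workflow (the model is a toy example):

    import tellurium as te

    r = te.loada('''
    model feedback
        J0: -> S1; k0
        J1: S1 -> ; k1*S1
        k0 = 1; k1 = 0.2; S1 = 0
    end
    ''')

    result = r.simulate(0, 50, 100)  # deterministic time course
    r.plot(result)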

An improved model generation method using Cello’s optimized parameters
Pedro Fontanarrosa, Göksel Misirli, Tramy Nguyen, Timothy S. Jones, Anil Wipat, and Chris Myers

A new method has been developed to integrate changes from the Virtual Parts Repository (VPR) into the genetic modeling pipeline of the iBioSim tool. This method was adapted to capture recent changes in the SBOL output of the Cello software, and the pipeline was updated accordingly to accommodate this new standard. Furthermore, we are now developing a new kind of dynamic modeling for iBioSim. The new model generation will take parameter values previously optimized by Cello and use them in a dynamic model in which the time-response function of the circuit can be simulated. This could further help us predict metrics (e.g., RNA-seq and Ribo-seq data) to be compared with experimental results and aid the debugging of genetic circuits.

MultiCellDS: Integration with other Standards
Samuel Friedman

In the past few years, MultiCellDS has matured and is facing three new challenges: 1) As the I/O format for multicellular simulators, we are looking at ways to integrate data from other standards such as SBML and CellML to better represent the models involved in our simulations, but we do not yet have ways of easily combining multiple standards; 2) Since MultiCellDS can also be used to record experimental data, we are working on creating translation software between MultiCellDS files and ISA-TAB to enable better data fusion between multiple modalities of human cancer data; 3) Recently, we have started working on establishing a Common Coordinate Framework (CCF) to record cellular location in the human body and on using MultiCellDS' properties to create a mapping between biological data and FAIR data storage, which requires us to ensure that we are using an open, community-established methodology for annotating both biological data and models, an ongoing COMBINE theme. These new challenges serve as a plan for the ongoing evolution of MultiCellDS.

Do predictions from protein structure improve clinical interpretation of genetic variants?
Louis Gil, Nicolas Lenz, Diego Quezada, and Melissa Cline

Understanding genomic Variants of Unknown Significance (VUS) provides insights into the identification and treatment of many diseases, including breast cancer. Because Hereditary Breast and Ovarian Cancers (HBOC) are often characterized by pathogenic mutations in BRCA1 and BRCA2, these mutations can be used as hallmarks to identify the risk or causes of these cancers. Variants are interpreted using clinical data supporting the risk probability, but retrieving clinical data is not always simple; therefore, in-silico predictors are often helpful when no clinical data are available. In this study, we compare and benchmark general predictors working at the genetic level against predictors that use protein structure, to identify cases where the protein-based analysis improves interpretation over the general predictors. The general in-silico predictors we used are MutPred, VEP, and CRAVAT (VEST); the protein-structure predictors are FoldX, Heat, and a threshold on relative solvent accessibility. FoldX predicts how the introduction of a mutation affects the protein and shows where the mutation is located, which could indicate a correlation with pathogenicity. These algorithms are then validated using clinically interpreted data as analyzed by ClinVar and ENIGMA (Evidence-based Network for Investigation of Germline Mutant Alleles). This approach may allow us to observe significance not yet validated by clinical data, which may be a factor in understanding these cancers and their risk. Protein structure may provide additional data with which to validate and enrich our genomic interpretation.

Computational analysis of the disease-causing mutations in Phosphodiesterases
Vidhyanand Mahase

Phosphodiesterases (PDEs) are a group of enzymes that catalyze the hydrolysis of cyclic adenosine monophosphate (cAMP) and cyclic guanosine monophosphate (cGMP), which are important second messengers in signal transduction. The hydrolysis of cAMP and cGMP by PDEs is very important in drug discovery because of the various signaling pathways regulated by these second messengers. For example, PDE4 is responsible for the hydrolysis of cAMP, downstream activation of protein kinase A, and subsequent phosphorylation of the transcription factor cAMP-response element binding protein. Because the activation of this pathway modulates gene transcription of numerous cytokines, disruption of PDE4 results in suppression of tumor necrosis factor α production and eventual inhibition of its proinflammatory and destructive properties, leading to mental disorders including schizophrenia and major depression. Several missense mutations have been identified in PDEs. These mutations reduce the binding efficiency of rolipram, a prominent PDE4 inhibitor, and result in continuous activation of cAMP and/or cGMP downstream signaling. In this study, we applied a computational approach to investigate the damaging mutations in PDEs. FoldX was used to determine the energetic effects of point mutations on protein stability and protein-protein interactions. The energy calculations revealed that the disease-causing mutations could reduce folding energy and affect binding energy. The results suggest that bioinformatics analysis can provide useful information for understanding the roles of coding mutations in the development of complex disorders.

Technology Mapping for Asynchronous Genetic Circuits
Tramy Nguyen, Timothy Jones, Prashant Vaidyanathan, Douglas Densmore, and Chris Myers

Most electrical circuits utilize a timing reference to synchronize the progression of signals and enable sequential memory elements. These designs may not be realizable in biological substrates due to the lack of a reliable clock signal. Asynchronous designs eliminate the need for a clock by using dual-rail input encoding and a handshake protocol that acknowledges signal receipt. We propose a workflow to automate the synthesis of asynchronous genetic circuit designs.